Uncertainty-Aware Federated Learning for Cyber-Resilient Microgrid Energy Management

Babayomi, Oluleke, Kim, Dong-Seong

arXiv.org Artificial Intelligence

Maintaining economic efficiency and operational reliability in microgrid energy management systems under cyberattack conditions remains challenging. Most approaches assume non-anomalous measurements, make predictions with unquantified uncertainties, and do not mitigate malicious attacks on renewable forecasts for energy management optimization. This paper presents a comprehensive cyber-resilient framework integrating federated Long Short-Term Memory-based photovoltaic forecasting with a novel two-stage cascade false data injection attack detection and energy management system optimization. The approach combines autoencoder reconstruction error with prediction uncertainty quantification to enable attack-resilient energy storage scheduling while preserving data privacy. We studied extreme false data attack conditions that caused 58% forecast degradation and 16.9% operational cost increases. The proposed integrated framework reduced false positive detections by 70%, recovered 93.7% of forecasting performance losses, and achieved 5% operational cost savings, mitigating 34.7% of attack-induced economic losses. Results demonstrate that precision-focused cascade detection with multi-signal fusion outperforms single-signal approaches, validating security-performance synergy for decentralized microgrids.
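The two-stage cascade idea described in this abstract can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation: the function name, thresholds, and the specific fusion rule (flag only when both signals fire) are assumptions based on the abstract's description of multi-signal fusion reducing false positives.

```python
# Hypothetical sketch of a two-stage cascade detector: a measurement is
# flagged as a false-data injection only if BOTH the autoencoder
# reconstruction error and the forecast uncertainty exceed their
# thresholds. All names and threshold values are illustrative.

def cascade_detect(recon_error, forecast_std, recon_thresh=0.15, std_thresh=0.30):
    """Stage 1: cheap reconstruction-error screen.
    Stage 2: uncertainty check, run only on stage-1 suspects.
    Requiring both signals is what suppresses false positives."""
    if recon_error <= recon_thresh:      # stage 1: looks normal, stop early
        return False
    return forecast_std > std_thresh     # stage 2: confirm with uncertainty

readings = [
    (0.05, 0.10),  # clean measurement
    (0.40, 0.12),  # noisy, but the forecast is confident -> not flagged
    (0.45, 0.55),  # high error AND high uncertainty -> flagged as attack
]
flags = [cascade_detect(e, s) for e, s in readings]
print(flags)  # [False, False, True]
```

Requiring agreement between two independent anomaly signals is a standard way to trade a little recall for much higher precision, which matches the 70% false-positive reduction reported above.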


Commonsense Generation and Evaluation for Dialogue Systems using Large Language Models

Estecha-Garitagoitia, Marcos, Zhang, Chen, Rodríguez-Cantelar, Mario, D'Haro, Luis Fernando

arXiv.org Artificial Intelligence

This paper provides preliminary results on turn-level data augmentation for dialogue systems based on different types of commonsense relationships, and on the automatic evaluation of the generated synthetic turns. The proposed methodology takes advantage of the extended knowledge and zero-shot capabilities of pretrained Large Language Models (LLMs) to follow instructions, understand contextual information, and apply commonsense reasoning. The approach draws inspiration from methodologies like Chain-of-Thought (CoT), applied more explicitly to prompt-based generation for dialogue-based data augmentation conditioned on commonsense attributes, and to the automatic evaluation of the generated dialogues. To assess the effectiveness of the proposed approach, we first extracted 200 randomly selected partial dialogues from 5 different well-known dialogue datasets and generated alternative responses conditioned on different event commonsense attributes. This novel dataset allows us to measure the proficiency of LLMs in generating contextually relevant commonsense knowledge, covering up to 12 specific relations from the ATOMIC [10] database. Second, we propose an evaluation framework, inspired by the ACCENT [26] metric, to automatically assess the quality of the generated dataset; ACCENT offers a nuanced approach to evaluating event commonsense. However, our method does not follow ACCENT's complex event-relation tuple extraction process. Instead, we propose an instruction-based prompt for each commonsense attribute and use state-of-the-art LLMs to automatically detect the original attributes used when creating each augmented turn in the previous step. Preliminary results suggest that our approach effectively harnesses LLMs' capabilities for commonsense reasoning and evaluation in dialogue systems.
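The "instruction-based prompt for each commonsense attribute" step can be illustrated concretely. The sketch below is not the authors' prompt wording; the template text and helper names are invented for illustration, though the relation names (xIntent, xReact, oEffect) follow the public ATOMIC schema.

```python
# Illustrative prompt builder: one instruction template per ATOMIC event
# relation, applied to a partial dialogue to request an augmented turn.
# Template wording and function names are assumptions, not the paper's.

ATOMIC_TEMPLATES = {
    "xIntent": "Rewrite the last turn so it reveals why the speaker did this.",
    "xReact":  "Rewrite the last turn so it expresses how the speaker feels afterwards.",
    "oEffect": "Rewrite the last turn so it mentions the effect on the listener.",
}

def build_prompt(dialogue_turns, relation):
    """Assemble an instruction-style prompt for one commonsense relation."""
    context = "\n".join(f"- {t}" for t in dialogue_turns)
    return (f"Dialogue so far:\n{context}\n\n"
            f"Instruction ({relation}): {ATOMIC_TEMPLATES[relation]}")

prompt = build_prompt(["A: I finally quit my job.", "B: Wow, really?"], "xIntent")
print(prompt)
```

The evaluation direction described above simply inverts this: given an augmented turn, a second prompt asks an LLM which relation was used, and the answer is checked against the relation recorded at generation time.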


An Agent-Based Modeling Approach to Free-Text Keyboard Dynamics for Continuous Authentication

Dillon, Roberto, Arushi, null

arXiv.org Artificial Intelligence

Continuous authentication systems leveraging free-text keyboard dynamics offer a promising additional layer of security in a multifactor authentication setup that can be used in a transparent way with no impact on user experience. This study investigates the efficacy of behavioral biometrics by employing an Agent-Based Model (ABM) to simulate diverse typing profiles across mechanical and membrane keyboards. Specifically, we generated synthetic keystroke data from five unique agents, capturing features related to dwell time, flight time, and error rates within sliding 5-second windows updated every second. Two machine learning approaches, One-Class Support Vector Machine (OC-SVM) and Random Forest (RF), were evaluated for user verification. Results revealed a stark contrast in performance: while One-Class SVM failed to differentiate individual users within each group, Random Forest achieved robust intra-keyboard user recognition (Accuracy > 0.7) but struggled to generalize across keyboards for the same user, highlighting the significant impact of keyboard hardware on typing behavior. These findings suggest that: (1) keyboard-specific user profiles may be necessary for reliable authentication, and (2) ensemble methods like RF outperform One-Class SVM in capturing fine-grained user-specific patterns. Keywords: keyboard dynamics, continuous authentication, agent-based modeling, One-Class SVM, Random Forest, behavioral biometrics.
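The feature-extraction step described above (dwell time, flight time, 5-second windows sliding by 1 second) can be sketched without any ML machinery. This is a minimal stdlib-only illustration under stated assumptions: keystrokes are (press, release) timestamps in seconds, and the feature names are invented, not the paper's.

```python
# Minimal sketch of sliding-window keystroke features: a 5-second window
# advanced in 1-second steps, computing mean dwell time (key hold) and
# mean flight time (release-to-next-press gap) per window.

def window_features(events, window=5.0, step=1.0):
    """events: list of (press_time, release_time) per keystroke, sorted."""
    if not events:
        return []
    end = events[-1][0]
    feats, start = [], 0.0
    while start <= end:
        w = [e for e in events if start <= e[0] < start + window]
        if len(w) >= 2:  # need at least two keystrokes for a flight time
            dwell = sum(r - p for p, r in w) / len(w)
            flight = sum(w[i + 1][0] - w[i][1] for i in range(len(w) - 1)) / (len(w) - 1)
            feats.append({"t0": start, "dwell": dwell, "flight": flight})
        start += step
    return feats

ev = [(0.0, 0.1), (0.3, 0.45), (0.8, 0.9), (1.4, 1.5)]
print(window_features(ev))
```

In the study's pipeline, vectors like these (plus error-rate features) would then be fed to the OC-SVM or RF classifier for verification.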


Improving Network Threat Detection by Knowledge Graph, Large Language Model, and Imbalanced Learning

Zhang, Lili, Zhu, Quanyan, Ray, Herman, Xie, Ying

arXiv.org Machine Learning

Network threat detection has been challenging due to the complexity of attack activities and the limited historical threat data to learn from. To help enhance existing practices of using analytics, machine learning, and artificial intelligence methods to detect network threats, we propose an integrated modelling framework in which a Knowledge Graph is used to analyze users' activity patterns, imbalanced learning techniques are used to prune and weight the Knowledge Graph, and a Large Language Model (LLM) is used to retrieve and interpret users' activities from the Knowledge Graph. The proposed framework is applied to Agile Threat Detection through Online Sequential Learning. Preliminary results show a 3%-4% improvement in threat capture rate and increased interpretability of risk predictions based on users' activities.
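One plausible form of the "weight the Knowledge Graph" step is shown below. This is a hedged sketch, not the paper's scheme: it applies inverse-class-frequency weighting (a standard imbalanced-learning heuristic) to graph edges so that edges associated with the rare threat class carry more weight.

```python
# Illustrative imbalanced-learning weighting for Knowledge Graph edges:
# each edge label gets weight total / (n_classes * count), so the rare
# "threat" class is up-weighted relative to the abundant "benign" class.
# The labels and the exact formula are assumptions for illustration.

from collections import Counter

def inverse_frequency_weights(edge_labels):
    counts = Counter(edge_labels)
    total = len(edge_labels)
    return {lbl: total / (len(counts) * c) for lbl, c in counts.items()}

labels = ["benign"] * 9 + ["threat"]          # a 9:1 class imbalance
w = inverse_frequency_weights(labels)
print(w)  # threat edges weigh ~9x more than benign edges
```

Pruning would then drop edges whose weighted contribution falls below a threshold, leaving a graph in which threat-relevant activity patterns are easier for the LLM to retrieve and explain.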


Intermittent Semi-working Mask: A New Masking Paradigm for LLMs

Lu, Mingcong, Zhu, Jiangcai, Hao, Wang, Li, Zheng, Zhang, Shusheng, Shao, Kailai, Chen, Chao, Li, Nan, Wang, Feng, Lu, Xin

arXiv.org Artificial Intelligence

Multi-turn dialogues are a key interaction method between humans and Large Language Models (LLMs). As conversations extend over multiple rounds, maintaining LLMs' high generation quality and low latency becomes a challenge. Mainstream LLMs can be grouped into two categories based on masking strategy: causal LLMs and prefix LLMs. Several works have demonstrated that prefix LLMs tend to outperform causal ones in scenarios that heavily depend on historical context, such as multi-turn dialogues or in-context learning, thanks to their bidirectional attention over prefix sequences. However, prefix LLMs suffer from inherently inefficient training on multi-turn dialogue datasets. In addition, the attention mechanism of prefix LLMs prevents reuse of the Key-Value Cache (KV Cache) across dialogue rounds to reduce generation latency. In this paper, we propose a novel masking scheme called Intermittent Semi-working Mask (ISM) to address these problems. Specifically, we apply alternating bidirectional and unidirectional attention to queries and answers in the dialogue history. In this way, ISM simultaneously maintains the high generation quality of prefix LLMs and the low generation latency of causal LLMs. Extensive experiments illustrate that our ISM achieves significant performance gains.
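The alternating-mask idea can be made concrete with a toy mask builder. This is a reconstruction from the abstract alone, so the exact segment semantics are assumed: query segments attend bidirectionally within themselves, answer segments attend causally, and every token sees all earlier segments.

```python
# Illustrative ISM-style attention mask (assumed semantics, not the
# paper's exact definition). segments: list of (length, kind), kind "q"
# for a user query and "a" for a model answer. mask[i][j] == 1 means
# position i may attend to position j.

def ism_mask(segments):
    n = sum(length for length, _ in segments)
    mask = [[0] * n for _ in range(n)]
    start = 0
    for length, kind in segments:
        for i in range(start, start + length):
            for j in range(n):
                if j < start:                        # full view of the past
                    mask[i][j] = 1
                elif kind == "q" and j < start + length:
                    mask[i][j] = 1                   # bidirectional inside a query
                elif kind == "a" and j <= i:
                    mask[i][j] = 1                   # causal inside an answer
        start += length
    return mask

for row in ism_mask([(2, "q"), (2, "a")]):
    print(row)
```

Because answer tokens only ever attend to fixed past positions, their keys and values never change as the dialogue grows, which is what allows KV Cache reuse across rounds in the causal-style segments.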


A Dialogue Game for Eliciting Balanced Collaboration

Jeknić, Isidora, Schlangen, David, Koller, Alexander

arXiv.org Artificial Intelligence

Collaboration is an integral part of human dialogue. Typical task-oriented dialogue games assign asymmetric roles to the participants, which limits their ability to elicit naturalistic role-taking in collaboration and its negotiation. We present a novel and simple online setup that favors balanced collaboration: a two-player 2D object placement game in which the players must negotiate the goal state themselves. We show empirically that human players exhibit a variety of role distributions, and that balanced collaboration improves task performance. We also present an LLM-based baseline agent, which demonstrates that automatic playing of our game is an interesting challenge for artificial systems.


To Trust or Not to Trust: Towards a novel approach to measure trust for XAI systems

Miró-Nicolau, Miquel, Moyà-Alcover, Gabriel, Jaume-i-Capó, Antoni, González-Hidalgo, Manuel, Campello, Maria Gemma Sempere, Sancho, Juan Antonio Palmer

arXiv.org Artificial Intelligence

The increasing reliance on Deep Learning models, combined with their inherent lack of transparency, has spurred the development of a novel field of study known as eXplainable AI (XAI). XAI methods seek to enhance end-users' trust in automated systems by providing insights into the rationale behind their decisions. This paper presents a novel approach for measuring user trust in XAI systems, allowing their refinement. Our proposed metric combines performance metrics and trust indicators from an objective perspective. To validate this novel methodology, we conducted a case study in a realistic medical scenario: the use of an XAI system for the detection of pneumonia from X-ray images.


Coding for Gaussian Two-Way Channels: Linear and Learning-Based Approaches

Kim, Junghoon, Kim, Taejoon, Das, Anindya Bijoy, Hosseinalipour, Seyyedali, Love, David J., Brinton, Christopher G.

arXiv.org Artificial Intelligence

Although user cooperation cannot improve the capacity of Gaussian two-way channels (GTWCs) with independent noises, it can improve communication reliability. In this work, we aim to enhance and balance the communication reliability in GTWCs by minimizing the sum of error probabilities via joint design of encoders and decoders at the users. We first formulate general encoding/decoding functions, where the user cooperation is captured by the coupling of user encoding processes. The coupling effect renders the encoder/decoder design non-trivial, requiring effective decoding to capture this effect, as well as efficient power management at the encoders within power constraints. To address these challenges, we propose two different two-way coding strategies: linear coding and learning-based coding. For linear coding, we propose optimal linear decoding and discuss new insights on encoding regarding user cooperation to balance reliability. We then propose an efficient algorithm for joint encoder/decoder design. For learning-based coding, we introduce a novel recurrent neural network (RNN)-based coding architecture, where we propose interactive RNNs and a power control layer for encoding, and we incorporate bi-directional RNNs with an attention mechanism for decoding. Through simulations, we show that our two-way coding methodologies significantly outperform conventional channel coding schemes (which do not utilize user cooperation) in sum-error performance. We also demonstrate that our linear coding excels at high signal-to-noise ratios (SNRs), while our RNN-based coding performs best at low SNRs. We further investigate our two-way coding strategies in terms of power distribution, two-way coding benefit, different coding rates, and block-length gain.
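The "coupling of user encoding processes" under a power constraint can be illustrated with a heavily simplified linear step. This is not the paper's optimal design: the weights, the clipping rule, and the function name are all assumptions chosen only to show the shape of the problem.

```python
# Toy linear two-way encoding step (illustrative, not the paper's scheme):
# at each round a user transmits a linear combination of its own message
# and its previous noisy reception of the other user (the coupling), then
# scales down if needed to respect a per-symbol power budget P.

import math

def linear_encode(message, prev_rx, w_msg, w_rx, power):
    s = w_msg * message + w_rx * prev_rx      # coupling with the other user
    scale = math.sqrt(power) / max(abs(s), 1e-12)
    return s * min(1.0, scale)                # clip to the power budget

x = linear_encode(message=1.0, prev_rx=-0.4, w_msg=0.8, w_rx=0.5, power=1.0)
print(x)  # within the power budget, so transmitted unscaled: 0.6
```

The joint-design problem in the paper amounts to choosing the weights (and their decoder counterparts) at both users simultaneously, since each user's transmission feeds the other's next reception.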


Faithful Persona-based Conversational Dataset Generation with Large Language Models

Jandaghi, Pegah, Sheng, XiangHai, Bai, Xinyi, Pujara, Jay, Sidahmed, Hakim

arXiv.org Artificial Intelligence

High-quality conversational datasets are essential for developing AI models that can communicate with users. One way to foster deeper interactions between a chatbot and its user is through personas, aspects of the user's character that provide insights into their personality, motivations, and behaviors. Training Natural Language Processing (NLP) models on a diverse and comprehensive persona-based dataset can lead to conversational models that create a deeper connection with the user, and maintain their engagement. In this paper, we leverage the power of Large Language Models (LLMs) to create a large, high-quality conversational dataset from a seed dataset. We propose a Generator-Critic architecture framework to expand the initial dataset, while improving the quality of its conversations. The Generator is an LLM prompted to output conversations. The Critic consists of a mixture of expert LLMs that control the quality of the generated conversations. These experts select the best generated conversations, which we then use to improve the Generator. We release Synthetic-Persona-Chat, consisting of 20k conversations seeded from Persona-Chat. We evaluate the quality of Synthetic-Persona-Chat and our generation framework on different dimensions through extensive experiments, and observe that the losing rate of Synthetic-Persona-Chat against Persona-Chat during the Turing test decreases from 17.2% to 8.8% over three iterations.
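The Generator-Critic expansion loop has a simple skeleton, sketched below with stand-in functions. Everything here is illustrative: in the paper both roles are played by LLMs (the Critic by a mixture of experts), whereas this toy uses a string generator and a length-based score purely to show the control flow.

```python
# Toy Generator-Critic loop: generate candidates from the current pool,
# score them with the critic, keep the best, and repeat. The generator
# and critic below are placeholders for the paper's LLM components.

import random

def generator(seed, rng):
    """Stand-in generator: emits candidate 'conversations' from a seed."""
    return [f"{seed} v{rng.randint(0, 99)}" for _ in range(4)]

def critic(conv):
    """Stand-in quality score; the paper uses a mixture of expert LLMs."""
    return len(conv)

def expand(seed_convs, iterations=3, keep=2):
    rng = random.Random(0)                    # fixed seed for repeatability
    pool = list(seed_convs)
    for _ in range(iterations):
        candidates = [c for s in pool for c in generator(s, rng)]
        pool = sorted(candidates, key=critic, reverse=True)[:keep]
    return pool

print(expand(["persona: likes hiking"]))
```

The key design point is that the survivors of each round become the seeds of the next, so the Generator is steered toward the Critic's notion of quality over iterations, matching the falling losing rate reported above.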


Federated Multilinear Principal Component Analysis with Applications in Prognostics

Zhou, Chengyu, Su, Yuqi, Xia, Tangbin, Fang, Xiaolei

arXiv.org Machine Learning

The use of tensors is progressively widespread in the realms of data analytics and machine learning. As an extension of vectors and matrices, a tensor is a multi-dimensional array of numbers that provides a means to represent data across multiple dimensions. As an illustration, Figure 1 shows an image stream that can be seen as a three-dimensional tensor, where the first two dimensions denote the pixels within each image, while the third dimension represents the distinct images in the sequence. One of the advantages of representing data as a tensor, as opposed to reshaping it into a vector or matrix, lies in its ability to capture intricate relationships within the data, especially when interactions occur across multiple dimensions. For instance, the image stream depicted in Figure 1 exhibits a spatiotemporal correlation structure. Specifically, pixels within each image have spatial correlation, and pixels at the same location across multiple images are temporally correlated. Transforming the image stream into a vector or matrix would disrupt the spatiotemporal correlation structure, whereas representing it as a three-dimensional tensor preserves this correlation. In addition to capturing intricate relationships, other benefits of using tensors include compatibility with multi-modal data (i.e., accommodating diverse types of data in a unified structure) and facilitating parallel processing (i.e., enabling the parallelization of operations), etc. As a result, the volume of research in tensor-based data analytics has been rapidly increasing in recent years (Shen et al., 2022; Gahrooei et al., 2021; Yan et al., 2019; Hu et al., 2023; Zhen et al., 2023; Zhang et al., 2023).
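The image-stream example in this passage can be made concrete with a tiny tensor built from nested lists (the actual pixel values are synthetic). The indexing convention follows the text: the first two axes are spatial and the third indexes the images in the sequence.

```python
# Toy version of the Figure 1 tensor: 2x2-pixel images over 3 time steps,
# stored as stream[row][col][frame]. Slicing along the third axis gives a
# pixel's temporal trajectory; fixing the frame index recovers one image.

frames = 3
stream = [[[10 * f + 2 * r + c for f in range(frames)]   # synthetic values
           for c in range(2)] for r in range(2)]

# Temporal fibre: one pixel location traced across all frames.
pixel_over_time = stream[0][1]
# Spatial slice: the first whole image, recovered from the tensor.
frame0 = [[stream[r][c][0] for c in range(2)] for r in range(2)]
print(pixel_over_time, frame0)  # [1, 11, 21] [[0, 1], [2, 3]]
```

Flattening this structure into a vector would interleave the spatial and temporal axes, which is exactly the loss of correlation structure the paragraph warns about.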